WHAT IS CLAIMED IS:

1. A multi-processor system comprising: a plurality of node 
groups each including a plurality of nodes and a service processor 
for managing said plurality of nodes; a service processor manager 
for managing said service processors of said plurality of node
groups; a network for interconnecting said plurality of nodes of
said plurality of node groups, and a partition including a selected 
number of nodes selected from said plurality of nodes of said 
plurality of node groups, wherein: 

a failed node among said selected number of nodes
transmits failure information including occurrence of a failure to a
corresponding service processor, which prepares first status 
information of said failed node based on error log information of 
said failed node and transmits said first status information to said 
service processor manager; 

said failed node transmits failure notification data including
said failure information to other nodes of said selected number of
nodes; 

said other nodes transmit said failure information to 
respective said service processors, which prepare second status 
information based on error log information of said other nodes
and transmit said second status information to said service 
processor manager; and 

said service processor manager identifies a location of said 
failed node based on said first and second status information to
indicate said service processors in said partition to recover from 
said failure. 
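
For illustration only, the localization step of claim 1 might be sketched as follows. The claim does not say how the service processor manager tells the first status information from the second, so this sketch assumes each status record carries a hypothetical `detected_locally` flag derived from the node's error log. The names `StatusInfo`, `locate_failed_node` and `recovery_targets` are likewise assumptions and do not come from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusInfo:
    """Status information a service processor sends to the manager (hypothetical layout)."""
    node_id: str
    service_processor: str
    # Assumption: True when the node's own error log shows the fault originated there,
    # False when the node only received a failure notification from another node.
    detected_locally: bool


def locate_failed_node(reports):
    """Return the id of the single node whose report marks the failure as local."""
    origins = {r.node_id for r in reports if r.detected_locally}
    if len(origins) != 1:
        raise ValueError(f"cannot localize failure, candidates: {sorted(origins)}")
    return next(iter(origins))


def recovery_targets(reports):
    """Return the service processors in the partition that should be told to recover."""
    return sorted({r.service_processor for r in reports})


reports = [
    StatusInfo("n2", "sp0", detected_locally=True),   # first status information
    StatusInfo("n1", "sp0", detected_locally=False),  # second status information
    StatusInfo("n5", "sp1", detected_locally=False),  # second status information
]
failed = locate_failed_node(reports)
targets = recovery_targets(reports)
```

Under these assumptions the manager combines both kinds of report: the first identifies the failed node, and the full set identifies which service processors cover the partition.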

2. The multi-processor system according to claim 1, wherein 
said failed node transmits a failure notification packet including 
said failure notification data to said other nodes through said 
network. 

3. The multi-processor system according to claim 2, wherein 
said failure notification packet has destination addresses 
specifying said other nodes. 

4. The multi-processor system according to claim 2, wherein 
said failure notification packet is transmitted by broadcasting to 
said plurality of nodes of said plurality of node groups, and said 
other nodes of said selected number of nodes fetch therein said 
failure notification packet based on partition information of said 
failed node. 
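
As a rough sketch of the broadcast variant in claim 4, each receiving node could compare the partition information carried in the packet with its own and fetch the packet only on a match. The packet layout, the integer `partition_id` and the names `FailureNotificationPacket`, `Node` and `broadcast` are assumptions made for this example and are not specified in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureNotificationPacket:
    source_node: str    # the failed node that sent the packet
    partition_id: int   # assumption: partition information encoded as an integer id
    failure_info: str


@dataclass
class Node:
    node_id: str
    partition_id: int

    def accepts(self, packet):
        """Fetch the packet only if it comes from another node in the same partition."""
        return (packet.partition_id == self.partition_id
                and packet.source_node != self.node_id)


def broadcast(packet, nodes):
    """Deliver the packet to every node; return the ids of the nodes that fetched it."""
    return [n.node_id for n in nodes if n.accepts(packet)]


nodes = [Node("n0", 1), Node("n1", 2), Node("n2", 2), Node("n3", 2), Node("n4", 1)]
packet = FailureNotificationPacket(source_node="n2", partition_id=2, failure_info="fatal")
receivers = broadcast(packet, nodes)
```

The sender needs no destination list in this variant; filtering happens at the receivers, which is the contrast with the addressed packet of claim 3.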

5. The multi-processor system according to claim 2, wherein 
said failed node transmits said failure information through a 
communication channel different from a communication channel 
used for an ordinary transaction.

6. The multi-processor system according to claim 1, wherein
said service processors and said service processor manager are 
connected together via a dedicated communication line. 

7. The multi-processor system according to claim 1, wherein if 
said corresponding service processor judges that said failure is a 
minor error, said corresponding service processor isolates said 
failed node from said partition. 

8. The multi-processor system according to claim 1, wherein 
said service processor manager indicates said service processors 
in said partition to reset said partition in synchrony with one 
another. 
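
One possible way to picture the synchronized reset of claim 8 is a barrier: no service processor starts its reset until every service processor in the partition has reported ready. The sketch below models service processors as threads and uses Python's `threading.Barrier`. This is an illustrative assumption and not the mechanism described in the claims, and the name `synchronized_reset` is hypothetical.

```python
import threading


def synchronized_reset(service_processors):
    """Run a mock reset on each service processor, released together by a barrier."""
    events = []  # list.append is atomic in CPython, so threads can share this list
    barrier = threading.Barrier(len(service_processors))

    def run(sp):
        events.append(("ready", sp))   # service processor acknowledges the instruction
        barrier.wait()                 # block until all service processors are ready
        events.append(("reset", sp))   # placeholder for the actual partition reset

    threads = [threading.Thread(target=run, args=(sp,)) for sp in service_processors]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = synchronized_reset(["sp0", "sp1", "sp2"])
```

Because every thread records "ready" before waiting on the barrier, all "ready" events precede all "reset" events, whatever order the threads are scheduled in.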

9. A method for recovering from a failure in a multi-processor 
system including: a plurality of node groups each including a 
plurality of nodes and a service processor for managing said 
plurality of nodes; a service processor manager for managing said 
service processors of said plurality of node groups; a network for 
interconnecting said plurality of nodes of said plurality of node 
groups, and a partition including a selected number of nodes 
selected from said plurality of nodes of said plurality of node 
groups, said method comprising the steps of:

transmitting failure information including occurrence of a 
failure from a failed node among said selected number of nodes to
a corresponding service processor, thereby allowing said
corresponding service processor to prepare first status information 
of said failed node based on error log information of said failed 
node and transmit said first status information to said service 
processor manager; 

transmitting failure notification data including said failure 
information from said failed node to other nodes of said selected 
number of nodes; 

transmitting said failure information from said other nodes 
to respective said service processors, thereby allowing said
service processors to prepare second status information based on 
error log information of said other nodes and transmit said second 
status information to said service processor manager; and 

allowing said service processor manager to identify a 
location of said failed node based on said first and second status 
information and indicate said service processors in said partition 
to recover from said failure. 



